The Problem Of Grue Isn't
The so-called problem of grue was introduced by Nelson Goodman in 1954 as a
"riddle" about induction, a riddle which has been widely thought to cast doubt
on the validity and rationality of induction. That unnecessary doubt in turn is
partly responsible for the reluctance to adopt the view that probability is
part of logic. Several authors have pointed out deficiencies in grue;
nevertheless, the "problem" still excites. Here, adapting an argument from
Groarke, the basis of grue is presented, along with another simple
demonstration that the "problem" makes no sense and is brought about by a
misunderstanding of causation.
Broccoli Reduces The Risk of Splenetic Fever! The use of induction and falsifiability in statistics and model selection
The title is a headline, and a typical one, from a newspaper's "Health &
Wellness" section, usually written by a reporter who has just read a medical
journal. It can only be the result of an inductive argument, which is an
argument from known contingent premisses to the unknown. What are the
premisses, what is unknown for this headline, and what does it mean for
statistics?
The importance--and rationality--of inductive arguments and their relation to
the frequently invoked but poorly understood notion of `falsifiability' are
explained in the context of statistical model selection. No probability model
can be falsified, and no hope for model building should be sought in that
concept. Comment: 20 pages
The Third Way Of Probability & Statistics: Beyond Testing and Estimation To Importance, Relevance, and Skill
There is a third way of implementing probability models and practicing
statistics. This is to answer questions put in terms of observables. This
eliminates frequentist hypothesis testing and Bayes factors, and it also
eliminates parameter estimation. The Third Way is the logical probability
approach, which is to make statements about observables of interest taking
specified values, given probative data, past observations (when present), and
some model (possibly deduced). Significance and the false idea that
probability models show causality are no more, and in their place are
importance and relevance. Models are built keeping only information that is
relevant and important to a decision maker (and not a statistician). All models
are stated in publicly verifiable fashion, as predictions. All models must
undergo a verification process before any trust is put into them.
Comment: 14 pages, 4 figures
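As a rough illustration of the kind of statement the Third Way makes (the
symbols below are notation assumed here for exposition, not taken from the
paper), a predictive probability for an observable might be written:

\[
  \Pr\left( Y \in y \;\middle|\; X,\, D,\, M \right),
\]
where $Y$ is the observable of interest, $y$ the values of interest, $X$ the
probative data, $D$ the past observations (when present), and $M$ the model
(possibly deduced). Every quantity in the statement is, at least in principle,
publicly observable or checkable.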
The Crisis Of Evidence: Why Probability And Statistics Cannot Discover Cause
Probability models are useful only for explaining the uncertainty of what we
do not know, and should never be used to say what we already know. Probability
and statistical models are useless at discerning cause. Classical statistical
procedures, in both their frequentist and Bayesian implementations, falsely
imply they can speak about cause. No hypothesis test, or Bayes factor, should
ever be used again. Even assuming we know the cause or partial cause for some
set of observations, reporting via relative risk exaggerates the certainty we
have in the future, often by a lot. This over-certainty is made much worse when
parametric and not predictive methods are used. Unfortunately, predictive
methods are rarely used; and even when they are, cause must still be an
assumption, meaning (again) certainty in our scientific pronouncements is too
high. Comment: Corrected typos; 22 pages, 5 figures
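A minimal sketch of the parametric-versus-predictive contrast the abstract
points to (not from the paper; the data and numbers are simulated and
hypothetical): under one fitted normal model, an interval for the parameter
(the mean) is far narrower than an interval for a future observable, so
reporting the former as if it described the future overstates certainty.

# Hedged sketch: parameter interval vs. predictive interval, simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
y = rng.normal(loc=10.0, scale=3.0, size=30)   # hypothetical measurements

n, ybar, s = len(y), y.mean(), y.std(ddof=1)
t = stats.t.ppf(0.975, df=n - 1)

# 95% interval for the parameter (the mean): narrow.
param_iv = (ybar - t * s / np.sqrt(n), ybar + t * s / np.sqrt(n))

# 95% predictive interval for the next observable: much wider.
pred_iv = (ybar - t * s * np.sqrt(1 + 1 / n), ybar + t * s * np.sqrt(1 + 1 / n))

print("interval for the mean:   ", np.round(param_iv, 2))
print("interval for a new value:", np.round(pred_iv, 2))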
On the non-arbitrary assignment of equi-probable priors
How to form priors that do not seem artificial or arbitrary is a central
question in Bayesian statistics. The case of forming a prior on the truth of a
proposition for which there is no evidence except the definite evidence that
the event can happen in a finite set of ways is detailed. The truth of a
proposition of this kind is frequently assigned a prior of 0.5 via arguments of
ignorance, randomness, the Principle of Indifference, the Principal Principle,
or by other methods. These are all shown to be flawed. The statistical
syllogism introduced by Williams in 1947 is shown to fix the problems that the
other arguments have. An example in the context of model selection is given.
Comment: 23 pages
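One common rendering of Williams's statistical syllogism (hedged; the paper's
exact formulation may differ) is:

\[
  \Pr\big(a \text{ is } G \mid \text{the proportion of } F\text{s that are }
  G \text{ is } r,\ \text{and } a \text{ is an } F\big) = r .
\]

In the finite-ways case, if the evidence says only that a proposition can be
true in exactly one of $n$ ways, then $r = 1/n$; for $n = 2$ this recovers the
0.5 assignment without appeals to ignorance, randomness, or indifference.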
On Probability Leakage
The probability leakage of model M with respect to evidence E is defined.
Probability leakage is a kind of model error. It occurs when M implies that
events which are impossible given E have positive probability. Leakage does
not imply model falsification. Models with probability leakage cannot be
calibrated empirically. Regression models, which are ubiquitous in statistical
practice, often evince probability leakage. Comment: 1 figure
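A toy illustration of the definition (not from the paper; the fitted values
are hypothetical): a normal model for a quantity that the evidence E says must
lie in [0, 100], such as a percentage, assigns positive probability outside
that range, and that mass is the leakage.

# Hedged toy example: probability mass a normal model puts on impossible events.
from scipy import stats

mu, sigma = 92.0, 7.0                        # hypothetical fitted values
model = stats.norm(mu, sigma)

leakage = model.cdf(0.0) + model.sf(100.0)   # mass below 0 plus mass above 100
print(f"probability leakage: {leakage:.3f}") # about 0.13 with these numbers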
It is Time to Stop Teaching Frequentism to Non-statisticians
We should cease teaching frequentist statistics to undergraduates and switch
to Bayes. Doing so will reduce the amount of confusion and over-certainty rife
among users of statistics.
MCMC Inference for a Model with Sampling Bias: An Illustration using SAGE data
This paper explores Bayesian inference for a biased sampling model in
situations where the population of interest cannot be sampled directly, but
rather through an indirect and inherently biased method. Observations are
viewed as being the result of a multinomial sampling process from a tagged
population which is, in turn, a biased sample from the original population of
interest. This paper presents several Gibbs Sampling techniques to estimate the
joint posterior distribution of the original population based on the observed
counts of the tagged population. These algorithms efficiently sample from the
joint posterior distribution of a very large multinomial parameter vector.
Samples from this method can be used to generate both joint and marginal
posterior inferences. We also present an iterative optimization procedure based
upon the conditional distributions of the Gibbs Sampler which directly computes
the mode of the posterior distribution. To illustrate our approach, we apply it
to a tagged population of messenger RNAs (mRNA) generated using a common
high-throughput technique, Serial Analysis of Gene Expression (SAGE).
Inferences for the mRNA expression levels in the yeast Saccharomyces cerevisiae
are reported.
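A much-simplified sketch of the biased-sampling idea, not the paper's Gibbs
sampler: with assumed-known bias weights and a conjugate Dirichlet prior placed
on the tagged-population proportions, the posterior for those proportions can
be sampled directly and then mapped back to the original-population scale. All
counts and weights below are hypothetical.

# Hedged toy sketch, not the paper's algorithm: conjugate posterior for the
# tagged proportions q, then the deterministic inverse-bias map theta_i ~ q_i / w_i.
import numpy as np

rng = np.random.default_rng(0)

counts = np.array([120, 80, 40, 10])    # hypothetical observed tag counts
w = np.array([2.0, 1.0, 1.0, 0.5])      # hypothetical, assumed-known bias weights
alpha = np.ones_like(w)                 # flat Dirichlet prior on the tagged proportions

q_draws = rng.dirichlet(alpha + counts, size=5000)       # posterior draws of q
theta_draws = q_draws / w                                # undo the sampling bias
theta_draws /= theta_draws.sum(axis=1, keepdims=True)    # renormalize to proportions

print("posterior mean, original-population proportions:",
      np.round(theta_draws.mean(axis=0), 3))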
Uncertainty in the MAN Data Calibration & Trend Estimates
We investigate trend identification in the LML and MAN atmospheric ammonia
data. The signals are mixed in the LML data, with roughly equal numbers of
positive, negative, and null trends found. The start date for trend
identification is crucial, with the claimed trends changing sign and
significance depending on the start date. The MAN data is calibrated to the
LML data. This calibration introduces uncertainty that has heretofore never
been accounted for in any downstream analysis, such as identifying trends. We
introduce a method to do this, and find that the number of trends identified
in the MAN data drops by about 50%. The missing data at MAN stations is also
imputed; we show that this imputation again changes the number of trends
identified, with more positive and fewer significant trends claimed. The sign
and significance of the trends identified in the MAN data change with the
introduction of the calibration and then again with the imputation. The
conclusion is that great over-certainty exists in current methods of trend
identification. Comment: 32 pages, 12 figures
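A toy sketch of the general idea of carrying calibration uncertainty into a
trend test (not the paper's method; the series, trend, and calibration error
below are all hypothetical): repeatedly perturb the series by plausible
calibration error, refit a linear trend, and count how often the trend remains
"significant".

# Hedged toy sketch: Monte Carlo propagation of calibration uncertainty
# into linear-trend significance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

years = np.arange(2005, 2021)
series = 10.0 + 0.15 * (years - years[0]) + rng.normal(0, 0.5, size=years.size)

print("p-value ignoring calibration uncertainty:",
      round(stats.linregress(years, series).pvalue, 4))

cal_sd = 1.0            # hypothetical per-observation calibration uncertainty
n_sims, n_signif = 2000, 0
for _ in range(n_sims):
    perturbed = series + rng.normal(0, cal_sd, size=series.size)
    n_signif += (stats.linregress(years, perturbed).pvalue < 0.05)

print("fraction of draws still 'significant':", round(n_signif / n_sims, 2))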
Fixes to the Ryden & McNeil Ammonia Flux Model
We propose two simple fixes to the Ryden and McNeil ammonia flux model. These
are necessary to prevent estimates from becoming unphysical, which very often
happens and which has not yet been noted in the literature. The first fix is
to constrain the limits of certain of the model's parameters; without these
limits, estimates from the model are seen to produce absurd values. The second
is to estimate a point at which additional contributions of atmospheric
ammonia are not part of a planned experiment but are the result of natural
background levels. These two fixes produce results that are everywhere
physical. Some experiment types, such as surface broadcast, are not well cast
in the Ryden and McNeil scheme, and lead to over-estimates of atmospheric
ammonia.
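A minimal sketch of the first fix in spirit only: it uses a hypothetical
saturating curve and made-up data, not the Ryden and McNeil equations, simply
to show how bounding parameters during fitting keeps estimates physical.

# Hedged sketch: bounded curve fitting with a *hypothetical* placeholder model.
import numpy as np
from scipy.optimize import curve_fit

def cumulative_flux(t, total, rate):
    # hypothetical saturating form for cumulative ammonia loss
    return total * (1.0 - np.exp(-rate * t))

t = np.array([0.5, 1, 2, 4, 8, 16, 32])               # days (hypothetical)
y = np.array([2.1, 3.8, 6.5, 9.0, 11.2, 12.1, 12.4])  # kg N/ha (hypothetical)

# Bounds keep both parameters non-negative, so fitted fluxes stay physical.
params, _ = curve_fit(cumulative_flux, t, y,
                      p0=[10.0, 0.2],
                      bounds=([0.0, 0.0], [np.inf, np.inf]))
print("fitted total loss and rate:", np.round(params, 3))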